Analyzing the New York State of Health

Author

Vincent Liu, Marco Mendoza

Published

April 26, 2025

Introduction

The visualizations on this page are built using real-world data from the New York Statewide Planning and Research Cooperative System (SPARCS). Managed by the New York State Department of Health, SPARCS collects detailed information about hospital discharges, patient characteristics, diagnoses, treatments, and charges from hospitals and clinics across New York State.

As we worked through this large dataset (over 1.19 million records), we found some really interesting patterns across different hospitals and procedures. Along the way, we explored how hospitals' total discharges change over time, how costs relate to discharges, and what broader trends emerge when you step back and look at the data as a whole.

This project is all about sharing those insights and helping make sense of the numbers behind healthcare in New York.

Note: We used both R and Tableau to create the following visualizations, which let us mix interactive and static charts.


Data Cleaning

# Load the data frame
library(vroom) # provides vroom(), cols(), and col_character()

hospital_df <- vroom("Hospital_Inpatient_Discharges__SPARCS_De-Identified___Cost_Transparency__Beginning_2009_20250426.csv",
                     col_types = cols(Discharges = col_character()))

# Clean the data frame
hospital_df$Discharges <- as.numeric(gsub(",", "", hospital_df$Discharges)) # strip thousands separators, then convert to numeric
hospital_df <- janitor::clean_names(hospital_df) # convert column names to lowercase snake_case
hospital_df <- na.omit(hospital_df) # drop rows that contain any missing value
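
To make the effect of these cleaning steps concrete, here is a minimal sketch on a hypothetical three-row data frame. The values and the name `toy_df` are made up for illustration and are not taken from SPARCS; only base R is used so the snippet runs without the CSV.

```r
# Toy data frame with hypothetical values; discharges are stored as text
# with thousands separators, as in the raw file
toy_df <- data.frame(
  discharges = c("1,234", "56", NA),
  mean_cost = c(1000.5, NA, 300),
  stringsAsFactors = FALSE
)

# Strip commas, then convert the text to numbers
toy_df$discharges <- as.numeric(gsub(",", "", toy_df$discharges))

# Drop every row that contains at least one missing value
toy_clean <- na.omit(toy_df)

print(toy_clean)
```

Because `na.omit()` drops a row when any column is missing, it can be worth comparing `nrow()` before and after to see how many records the step removes.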

Visualizations

Visualization 1: Total Discharges Each Year by Facility

Analysis

This line plot shows total discharges by facility from 2010 to 2021. Although the lines initially seem overwhelming, the interactive features make it easy to isolate individual hospitals. Facilities near the top maintain consistently high discharge volumes, reflecting their roles as major healthcare providers. Most hospitals show stable trends over time, though some display notable fluctuations, hinting at operational changes or shifting patient demand.
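
The aggregation behind a plot like this can be sketched as follows. This is an illustration on toy data, not the code used for the chart; the column names `year`, `facility_name`, and `discharges` are assumptions about what `janitor::clean_names()` produces for this file.

```r
# Toy stand-in for hospital_df (hypothetical values and facility names)
toy_df <- data.frame(
  year = c(2010, 2010, 2010, 2011, 2011),
  facility_name = c("Hospital A", "Hospital A", "Hospital B",
                    "Hospital A", "Hospital B"),
  discharges = c(100, 50, 30, 120, 40)
)

# Sum discharges within each year/facility pair
yearly_totals <- aggregate(discharges ~ year + facility_name,
                           data = toy_df, FUN = sum)

# Sort so each facility's years appear in order, ready for a line plot
yearly_totals <- yearly_totals[order(yearly_totals$facility_name,
                                     yearly_totals$year), ]

print(yearly_totals)
```

Each row of `yearly_totals` is one point on one facility's line.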

Visualization 2: Top 15 Highest Charge-Gap Facilities

Analysis

This bar chart ranks hospitals by their Charge-Cost Gap (Mean Charge - Mean Cost), a profit-like margin per discharge. Facilities at the top of the chart show the largest margin between what care costs and what is billed. This analysis highlights disparities in hospital pricing behavior: it flags hospitals where patients may be billed substantially more than the cost of providing services, raising important questions about pricing transparency and hospital profitability. For example, across the years in the data, Westchester Medical Center's charges exceeded its costs by a total of $1.5 billion.
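
One way to compute and rank the gap is sketched below on toy data. The column names `facility_name`, `mean_charge`, and `mean_cost` are assumed, and averaging the gap within each facility is also an assumption: the ranking in the chart may have been aggregated differently (for example, weighted by discharges).

```r
# Toy stand-in for hospital_df (hypothetical values and facility names)
toy_df <- data.frame(
  facility_name = c("Hospital A", "Hospital A", "Hospital B", "Hospital C"),
  mean_charge = c(5000, 7000, 3000, 9000),
  mean_cost = c(2000, 3000, 2500, 4000)
)

# Per-row gap, as defined in the text: mean charge minus mean cost
toy_df$charge_cost_gap <- toy_df$mean_charge - toy_df$mean_cost

# Average the gap within each facility (an assumed aggregation)
gap_by_facility <- aggregate(charge_cost_gap ~ facility_name,
                             data = toy_df, FUN = mean)

# Rank from largest to smallest gap and keep the top n
# (15 in the chart; 2 here because the toy data is tiny)
top_n <- 2
ranked <- gap_by_facility[order(-gap_by_facility$charge_cost_gap), ]
top_gap <- head(ranked, top_n)

print(top_gap)
```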

Visualization 3: Average Charges and Patient Volume

Analysis

This line chart tracks Average Mean Charge and Discharges from 2008 to 2022, filtered for a specific hospital. It lets us assess whether patient volumes and hospital charges have risen, fallen, or moved independently of each other. The trends show that while patient discharges may decline in certain years (such as during the COVID-19 pandemic), the average charge per patient can still increase, indicating that rising healthcare costs are not driven solely by patient volume.
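
The two series in this chart can be derived roughly as follows. This is a sketch on toy data with assumed column names (`year`, `facility_name`, `discharges`, `mean_charge`); the facility filter stands in for the one applied in the interactive chart.

```r
# Toy stand-in for hospital_df (hypothetical values and facility names)
toy_df <- data.frame(
  year = c(2019, 2019, 2020, 2020, 2019),
  facility_name = c("Hospital A", "Hospital A", "Hospital A",
                    "Hospital A", "Hospital B"),
  discharges = c(100, 200, 80, 150, 500),
  mean_charge = c(4000, 6000, 5000, 8000, 1000)
)

# Keep a single facility, mirroring the chart's filter
one_facility <- toy_df[toy_df$facility_name == "Hospital A", ]

# Total discharges per year and average of the mean charge per year
yearly_discharges <- aggregate(discharges ~ year, data = one_facility, FUN = sum)
yearly_charge <- aggregate(mean_charge ~ year, data = one_facility, FUN = mean)

# Combine the two series into one table keyed by year
trend <- merge(yearly_discharges, yearly_charge, by = "year")

print(trend)
```

In this toy example discharges fall from one year to the next while the average charge rises, which is the kind of divergence described above.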

Visualization 4: Average Costs and Charges based on Severity Level

Analysis

This bar chart compares the Average Mean Cost and Average Mean Charge across the four severity levels: Minor, Moderate, Major, and Extreme.

The visualization clearly demonstrates that both the cost of care and the prices charged to patients increase progressively as severity worsens. However, it also shows that the rate of increase in charges often outpaces the increase in actual cost, suggesting growing hospital margins for higher-severity patients. This supports deeper discussions around healthcare billing and fairness.
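
A simple way to check whether charges outpace costs is to compare the charge-to-cost ratio across severity levels, sketched here on toy data. The column name `severity` is a hypothetical shorthand for the dataset's severity description column, and all values are made up.

```r
# Toy stand-in for hospital_df (hypothetical values)
toy_df <- data.frame(
  severity = c("Minor", "Minor", "Moderate", "Major", "Extreme", "Extreme"),
  mean_cost = c(1000, 3000, 4000, 8000, 20000, 30000),
  mean_charge = c(2000, 6000, 10000, 24000, 80000, 100000)
)

# Average cost and charge within each severity level
by_severity <- aggregate(cbind(mean_cost, mean_charge) ~ severity,
                         data = toy_df, FUN = mean)

# Put the levels in clinical order rather than alphabetical order
severity_levels <- c("Minor", "Moderate", "Major", "Extreme")
by_severity <- by_severity[match(severity_levels, by_severity$severity), ]

# A ratio that rises with severity means charges grow faster than costs
by_severity$charge_to_cost <- by_severity$mean_charge / by_severity$mean_cost

print(by_severity)
```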


Conclusions

Through these four visualizations, we addressed key data exploration questions posed by the New York State Department of Health. We revealed important trends in healthcare charges and costs over time, identified facilities with significant financial gaps, and demonstrated that care costs scale with patient severity.

These insights help paint a clearer picture of hospital operations across New York State and offer valuable starting points for policy discussions, healthcare cost reforms, and future predictive modeling efforts.